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Abstract. There are a huge number of problems, from various areas, being solved by 
reducing them to SAT. However, for many applications, translation into SAT is performed 
by specialized, problem-specific tools. In this paper we describe a new system for uni- 
form solving of a wide class of problems by reducing them to SAT. The system uses a 
new specification language URSA that combines imperative and declarative programming 
paradigms. The reduction to SAT is defined precisely by the semantics of the specification 
language. The domain of the approach is wide (e.g., many NP-complete problems can be 
simply specified and then solved by the system) and there are problems easily solvable 
by the proposed system, while they can be hardly solved by using other programming 
languages or constraint programming systems. So, the system can be seen not only as 
a tool for solving problems by reducing them to SAT, but also as a general-purpose con- 
straint solving system (for finite domains). In this paper, we also describe an open-source 
implementation of the described approach. The performed experiments suggest that the 
system is competitive to state-of-the-art related modelling systems. 



Following spectacular advances made over the last years, the SAT solving technology has 
many successful applications nowadays — there is a wide range of problems being solved 
by reducing them to SAT. Very often solving based on reduction to SAT is more efficient 
than using a problem-specific solution. Therefore, SAT solvers are already considered to 
be a Swiss army knife for solving many hard CSP and NP-complete problems and in many 
areas including software and hardware verification, model checking, termination analysis, 
planning, scheduling, cryptanalysis, electronic design automation, etc. [6l [3l [12l [251 EH SHI 
I50j . Typically, translations into SAT are performed by specialized, problem-specific tools. 
However, using a general-purpose system capable of reducing a wide range of problems to 
SAT can simplify this task, make it less error prone, and make this approach more easily 
accessible and more widely accepted. 

There are already a number of approaches for solving combinatorial and related prob- 
lems by general-purpose systems that reduce problems to underlying theories and domains 
(instead of developing special purpose algorithms and implementations). A common moti- 
vation is that it is much easier to develop a problem specification for a general system than a 
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new, special-purpose solver. The general problem solving systems include libraries for gen- 
eral purpose programming languages, but also modelling and programming languages built 
on top of specific solvers [361 1331 1121 El- Most modelling languages are highly descriptive 
and have specific language constructs for certain sorts of constraints. Specific constraints 
are translated to underlying theories by specific reduction techniques. Some modelling sys- 
tems use SAT as the target problem [31^ [35] and some of them focus on solving NP-complete 
problems by reduction to SAT |10j . 

In this paper we present a novel approach for solving problems by reducing them to 
SAT. The approach can be seen also as a general-purpose constraint programming system 
(for finite domains). The approach consists of a new specification/modelling language URSA 
(from Uniform Reduction to SAt) and an associated interpreter. In contrast to other mod- 
elling languages, the proposed language combines features of declarative and imperative 
programming paradigms. What makes the language declarative is not how the constraints 
are expressed, but only the fact that a procedure for finding solutions does not need to be 
explicitly given. On the other hand, the system has features of imperative languages and 
may be seen as an extension of the imperative programming paradigmQ similarly as some 
constraint programming systems are extensions of logic programming paradigm. In contrast 
to other modelling languages, in the proposed specification language loops are represented 
in the imperative way (not by, generally more powerful, recursion), destructive updates are 
allowed, there is support for constraints involving bitwise operators and operators for arith- 
metic modulo 2". There are problems for which, thanks to these features, the modelling 
process is simpler and specifications are more readable and easier to maintain than for other 
languages and constraint systems. However, of course, the presented system does not aim 
to replace other constraint systems and languages, but rather to provide a new alternative 
with some distinctive features. 

The used uniform approach enables a simple syntax and semantics of the URSA language, 
a simple, uniform reduction to SAT and, consequently, a simple architecture of the whole 
system. This enables a straightforward implementation of the proposed system and a rather 
straightforward verification of its correctness. This is very important because, although it 
is often easier for a declarative program than for a corresponding imperative program to 
verify that it meets a given specification, this still does not lead to a high confidence if the 
constraint solving system itself cannot be trusted. 

The presented approach is accompanied with an open-source implementation, publicly 
available on the Internet. A limited experimental comparison suggest that the system 
(combined with state-of-the-art SAT solvers) yields good performance, competitive to other 
modern approaches. 

Overview of the paper. In Section [2] we give relevant definitions; in Section [3] we provide 
motivation and basic ideas of the proposed approach. In Section H] we describe the specifi- 
cation language URSA, in Section[5]its semantics, in Section[6]the corresponding interpreter, 
and in Section [7] pragmatics of the language. In Section [8] we discuss related techniques, 
languages and tools. In Section [9] we discuss directions for future work and in Section [10] 
we draw final conclusions. 



^There are constraint programming libraries for imperative languages, but these still do not follow the 
spirit of imperative programming and are substantially declarative. 
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2. Background 

In this section we give a brief account of the SAT and CSP problems and related notions. 

Propositional logic. We assume standard notions of propositional logic: literal, clause, propo- 
sitional formula, conjunctive normal form (cnf), valuation (or assignment), interpretation, 
model, satisfiable formula, etc. We denote by T and _L the Boolean constants true and 
false and the logical connectives by -> {negation), A {conjunction), V {disjunction), © {ex- 
clusive disjunction), =^ {implication), <^ {equivalence). Two formulae A and B are said to 
be equivalent if A and B have the same truth value in any valuation. Two formulae A and 
B are said to be weakly equivalent (or equisatisfiable) if whenever A is satisfiable then B is 
satisfiable and vice versa. 

Constraint Satisfaction Problem. A constraint satisfaction problem (cSP) is defined as a 
triple {X,D,C), where X is a finite set of variables xi, X2, . . ., x^, is a set of domains 
di, d2, . . ., dn for these variables, and C is a set of constraints ci, C2, ■ ■ ., Ck- In a finite- 
domain CSP, all sets from D are finite. Constraints from C may define combinations of 
values assigned to variables that are allowed or that are prohibited. A problem instance 
is satisfiable if there is an assignment to variables such that all constraints are satisfied. 
Such assignment is called a solution. A constraint optimization problem is a CSP in which 
the goal is to find a solution maximizing (or minimizing) a given objective function over all 
allowed values of the given variables. 

SAT Problem and SAT Solvers. SAT is the problem of deciding if a given propositional formula 
in CNF is satisfiable, i.e., if there is any assignment to variables such that all clauses are true. 
Obviously, SAT is a special case of CSP, with all variables ranging over the domain {0, 1} and 
with constraints given as clauses. SAT was the first problem shown to be NP-complete |15j . 
and it still holds a central position in the field of computational complexity. Stochastic SAT 
solvers cannot prove the input instance to be unsatisfiable, but may find a solution (i.e., a 
satisfying variable assignment) for huge satisfiable instances quickly. On the other hand, 
for a given SAT instance, a complete SAT solver always finds a satisfying variable assignment 
or shows that there is no such assignment. Most of the state-of-the-art complete SAT 
solvers are cdcl (conflict-driven, clause-learning) based extensions of the Davis-Putnam- 
Logemann-Loveland algorithm (dpll) [18 \ \17 \ [3]. In recent years, a tremendous advance 
has been made in SAT solving technology [52 ^ 122 ^ \75 \ [3]. These improvements involve both 
high-level and low-level algorithmic techniques. The advances in SAT solving make possible 
deciding satisfiability of some industrial SAT problems with tens of thousands of variables 
and millions of clauses. 

3. Problem Specification and Problem Solving 
There are two basic components of the presented approach: 

• problem specification: a problem is specified by a test (expressed in an imperative form) 
that given values of relevant variables are indeed a solution to the problem. 

• problem solving: all relevant variables of the problem are represented by finite vectors of 
propositional formulae (corresponding to vectors of bits and to binary representation, in 
case of numerical values); the specification is symbolically executed over such represen- 
tation and the assertion that given values make a solution is transformed to an instance 
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of the SAT problem and passed to a SAT solver. If the formula is satisfiable, its model is 
transformed back to variables describing the problem, giving a solution to the problem. 

3.1. Problem Specification. Let us consider problems of the following general form: find 
( if it exists ) a set of values S such that given constraints are met (variations of this form 
include: only checking if such values exists, and finding all values that meet the given 
conditions). A problem of this form can be specified by a test that checks if a given set S 
meets the given constraints (with one assertion that combines all the constraints). The test 
can be formulated in a language designed in the style of imperative programming languages 
and such a test is often easy to formulate. 

Example 3.1. Let us consider a trivial problem: if v equals u+1, find a value for u such 

that V equals 2. A simple check in an imperative form can be specified for this problem 
— if a value of u is given in advance, one could easily check whether it is a solution of the 
problem. Indeed, one would assign u + \iov and finally check whether v equals 2. Such test 
can be written in the form of an imperative C-like code (where assert (b) checks whether 
b is true) as follows: 

V = u+1; 
assert (v==2) ; 

The example above is trivial, but specifications may involve more variables and more 
complex operations, including conditional operations and loops, as illustrated by the fol- 
lowing example. 

Example 3.2. The most popular way of generating pseudorandom numbers is based on 
linear congruential generators. A generator is defined by a recurrence relation of the form: 

Xn+i = {axn + c) (mod m) (for n > 0) 
and xo is the seed value (0 < xq < m). One example of such relation is: 

Xn+i = (1664525x„ + 1013904223) (mod 2^2) (for ra > 0) 
It is trivial to compute elements of this sequence. The check that xioo is indeed equal to the 
given value if the seed is equal to nseed can be simply written in the form of an imperative 
C-like code as follows (assuming that numbers are represented by 32 bits): 

nx=nseed; 

for(ni=l;ni<=100;ni++) 

nx=nx*1664525+1013904223; 
assert (nx==3998113695) ; 

However, the following problem, realistic in simulation and testing tasks, is a non-trivial 
programming problem (unless problem-specific, algebraic knowledge is used): given, for 
example, the value xiqq compute xq. Still, the very same test shown above can serve 
as a specification of this problem. This example illustrates one large family of problems 
that can be simply specified using the proposed approach — problems that are naturally 
expressed in terms of imperative computations and that involve destructive assignments. 
Such problems are often difficult to express using other languages and systems. For the 
above specification, since in constraint programming systems the destructive assignment is 
not allowed, in most specification languages one would have to introduce variables for all 
elements of the sequence from xq to xioo and the constraints between any succeeding two. 
Also, other systems typically do not support modular arithmetic constraints and integers 
of arbitrary length. 
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Note that the specifications given above also cover the information on what variables 
are unknown and have to be determined so that the constraints are satisfied — those are 
variables that appear within commands before they were defined. So, the above code is 
a full and precise specification of the problem, up to the domains of the variables. For 
Boolean variables, the domain is {_L,T}, while for numerical variables a common domain 
interval (e.g., [0,2" — 1]) can be assumed and additional domain restrictions can be given 
within the specification. 

3.2. Problem Solving. The described imperative tests form problem specifications and 
they can be used as a starting point in problem solving. Let us first describe a straightfor- 
ward, naive approach. 

Assume that there is a problem specification (in the form of an imperative test) and 
assume there is a common domain for all unknowns (except Boolean unknowns) , for exam- 
ple, the interval [0, 2" — 1] (for a given n). Then, for all admissible values for all unknowns, 
the specification can be executed. All sets of values satisfying the constraints should be 
returned as solutions. If there are k unknown numerical variables and / unknown Boolean 
variables, then the search space would be of the size 2"'^"'"^ 

Example 3.3. Let us consider the specification given in Example 13.11 If the domain for 
u and V is the interval [0,3], the specification should be executed four times and only the 
value 1 assigned to u leads to the constraint met, so it is the only solution of the problem. 

Obviously, the above naive and brute-force approach based on systematic enumeration 
of all possible input values is complete (for finding all solutions), but extremely inefficient. 
It can be turned to a much more efficient version that takes into account given relationships 
between variables in order to reduce the search space. The basic idea is to represent all 
unknowns abstractly, in a symbolic form, as vectors of propositional formulae. Then, all 
steps in the specification can be performed using this abstracted form (i.e., can be symbol- 
ically executed). Finally, the assertion would generate a propositional formula for which 
a satisfying valuation is to be found. If there are k unknown numerical variables and / 
unknown Boolean variables, then the number of possible valuations would be 2"*''"*"' (if the 
interval [0,2" — 1] is assumed as the domain for numerical values). Of course, instead of 
a brute-force search over this set of valuations, a SAT solver should be used (and it will 
typically perform many cut-ofFs and search over just a part of the whole search space). 

Representation of numerical variables by propositional formulae corresponds to their 
binary representation. Each formula corresponds to one bit of the binary representation. 
If the range of a numerical variable is [0,2" — 1], then it is represented by a vector of n 
propositional formulae. If a bit of the number is known to be 1, then the corresponding 
formula is T, and if a bit of the number is known to be 0, then the corresponding formula is 
_L. For instance, for n = 2, 1 is represented by [-L,T] (where the last position corresponds 
to the least significant bit). If a bit of the number is not known, then it is represented by a 
propositional variable, or, if it depends on some conditions, by a propositional formula. We 
will discuss only representations of unsigned integers, but representations of signed integers 
can be treated in full analogy (moreover, fioating point numbers can also be modelled in 
an analogous way). Boolean variables are represented by unary vectors of propositional 
formulae. 

Results of arithmetic and bitwise logical operations over numbers represented by vec- 
tors of formulae can be again represented by propositional formulae in terms of formulae 
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occurring in the input arguments. If the numbers are treated as unsigned, all arithmetic 
operations are performed modulo 2". For instance, if u is represented by [p, q] and v is rep- 
resented by [r, s], then v + u (modulo 2^) is represented by [{p(Br) (B {q A s) , q ® s]. Relational 
operations over numbers (=, <, >, <, >, 7^, etc.) and logical operations and relations over 
Boolean values can also be represented. For instance, if u is represented by \p, q] and v is 
represented by [r, s], then u = v is represented by the unary vector [{p 44> r) A (g 44> s)]. Note 
that representations of all standard arithmetic. Boolean, and relational operations produce 
polynomial size formulae. 

If a problem specification is executed over the variables represented by vectors of propo- 
sitional variables and using the corresponding interpretation of involved operations, then 
the assertion of the specification generates a propositional formula. Any satisfying valua- 
tion (if it exists) for that formula would yield (ground) values for numerical and Boolean 
unknowns that meet the specification, i.e., a solution to the problem. 

Example 3.4. Let us again consider Example 13. li If u is represented by [p, q] (and 1 is 
represented by [_L, T]), then, by the condition v=u+l, v is represented by [(p©_L)©(gAT), 
T], i.e., after simplification, by [pQq, "■<?]. From the assertion v==2, it follows that [p(Bq, ^q] 
should be equivalent to [T, _L]. In other words, the formula {{p(Bq) <^ T) A(-ig <^ _L) should 
be checked for satisfiability. It is satisfiable, and in its only model p maps to _L and q maps 
to T. Hence, the representation for a required value of u is [_L, T], i.e., u equals 1. 

3.3. Domain of the Approach. A system based on the ideas presented above, could be 
used not only for combinatorial problems, but for a very wide range of problems — it can 
be used for computing x such that /(x) = y, given y and a computable function / with 
a finite domain and a finite range (i.e., for computing inverse of /). A definition of / in 
an imperative form can serve as a specification of the problem in verbatim. Having such a 
specification of the function / is a weak and realistic assumption as it is easy to make such 
specification for many interesting problems, including NP-problems |26j . If / is a function 
such that /(x) = 1 when x is a witness for some instance of an NP-problem, / can serve as 
a specification for this problem and the required answer is yes if and only if there is x such 
that f{x) = 1. 

Concerning the type of numbers involved, the approach can be applied for any finite 
representation of signed or unsigned, integer or floating point numbers. 

In the proposed approach, all computations (over integers) are performed modulo 2"-. 
In the case of non-modular constraints, the base can be set to a sufficiently large value. 

The approach (in the presented form) cannot be used for computing x such that /(x) = 
y, for arbitrary computable function /. The first limitation is a finite representation of 
variables. The second is that conditional commands in the specification could involve only 
conditions with ground values at the time when the condition is evaluated. However, this 
restriction is not relevant for many (or most of) interesting problems. Overall, the domain 
of the proposed approach covers all problems with Boolean and numerical unknowns, over 
Boolean parameters and numerical parameters with finite domains, that can be stated in 
the specification language that makes the part of the approach. 
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4. Syntax of ursa Language 

In this section we describe the language URSA that serves as a specification language in the 
spirit of the approach presented above. A description of the syntax of the URSA language 
is given, in EBNF representation, in Table [J ((num var) denotes the syntactical class of 
numerical variables, (num expr) denotes the syntactical class of numerical expressions, (bool 
expr) denotes the syntactical class of Boolean expressions, etc). An URSA program is a 
sequence of statements (and procedure definitions). There are two types of variables — 
numerical, with identifiers starting with n and Boolean, with identifiers starting with b. The 
same convention holds for identifiers of arrays. Variables are not declared, but introduced 
dynamically. There are functions (bool2num and num2bool) for converting Boolean values 
to numerical values and vice versa, and the sgn function corresponding to signum function. 
Arithmetic, bitwise, relational and compound assignment operators, applied over arithmetic 
variables/expressions, are written in the C-style. For example, bitwise conjunction over 
numerical variables nl and n2 is written nl & n2, bitwise left shift of nl for n2 is written 
nl « n2, and nl += n2 is equivalent to nl = nl+n2. Logical operators, applied over 
Boolean variables/expressions, are written in the C-style, with additional operator for 
logical exclusive disjunction, in the spirit of other C logical operators. There are also 
compound assignment operators for logical operators, such as &&= (added for symmetry 
and convenience, although they do not exist in C). The operator ite is the conditional 
operator: ite(b,nl,n2) equals nl if b is true, and equals n2 otherwise. There are no 
user-defined functions, but only user-defined procedures. 

The role of the command assert is to assert that some constraint (given as the ar- 
gument) is met (as in C). During the interpretation of the URSA program, this command 
invokes the solving mechanism and seeks for an assignment to the introduced unknowns 
that make this constraint true. The command assert_all is analogous, but it seeks for all 
satisfying models. One program can have several commands of this sort, but each of them 
works locally (as in C), i.e., it invokes the solving mechanism just for its own argument. 
The instructions minimize (and maximize) state that a minimal (or maximal) value within 
the given range for the given numerical variable should be found. The commands assert 
and assert_all take into account only the last minimize/maximize instruction. 

There are miscellaneous commands, mostly intended to be used in interactive mode (for 
listing values and status of variables — listvars, for deleting values of all current variables 
— clear, and stopping the interpreter — halt). 

The following two examples illustrate some constructs of the specification language — 
the use of procedures and the use of the operator maximize. 

Example 4.1. For a given value k, the task is to find all values x, y, and z such that 
rj.k yk ^ 2:'^(mod 2"). The following URSA code (with a procedure that computes the 
power function) specifies the problem for k = 2: 
procedure power (na.nk) { 
np=na ; 

f or (ni=l ;ni<nk;ni=ni+l) 
na = na*np; 

} 

nk=2; 

nxpowernk=nx ; 
nypowernk=ny ; 
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(program) : 


= (procedure def)* (statement)* 


(procedure def) : 


= "procedure" (procedure name) ( "(" ")" I 

"(" ((num var id) I (bool var id)) ("," ((num var id) I (bool var id)))* ")" ) 
"{" (statement)* "}" 


(procedure name) : 


(letter) ((letter) I (digit))* 


(statement) : 


= (statement)* "}" 

1 (num var) (assign num op) (num expr) " ; " 
1 (num var) (num op postfix) 
1 (bool var) (assign bool op) (bool expr) ";" 
1 "while" "(" (bool expr) ")" (statement) 

1 "for" "(" (statement) (bool expr) ";" (statement) ")" (statement) 
1 ii-j^fii "(11 ^bool expr) ")" (statement) [ "else" (statement) ] 
1 "call" (procedure name) ( "(" ")" I 

"(" ((num expr) I (bool expr)) ("," ((num expr) I (bool expr)))* ")" ) 
1 "minimize" "(" (num var) "," (num const) "," (num const) ")" 
1 "maximize" "(" (num var) "," (num const) "," (num const) ")" ";" 
1 "assert" "(" (bool expr) (";" (bool expr))* ")" ";" 
1 "assert-all" "(" (bool expr) (";" (bool expr))* ")" ";" 
1 "print" ((num expr) I (bool expr)) ";" 
1 "listvars" ";" 
1 "clear" ";" 
1 "halt" ";" 


(num expr) : 


= (num const) 
1 (num var) 

1 (un num op) (num expr) 

1 (num expr) (num op) (num expr) 

1 ite "(" (bool expr) "," (num expr) "," (num expr)")" 

1 "sgn" "(" (num expr) ")" 

1 "bool2num" "(" (bool expr) ")" 

1 "(" (num expr) ")" 


(num var) : 


= (num var id) 

1 (num var id) " [" (num expr) "] " 

1 (num var id) "[" (num expr) "]" " [" (num expr) "] " 


(num const) 




= ( (digit) )+ 


(num var id) 




= "n" ((letter) I (digit))* 


(assign num op) 




= " = " 1 "+=" 1 "-=" 1 "H< = ii 1 "&=" 1 "1=" 1 "- = " 1 "«=" 1 "»=" 


(num op) 




= ■■+■■ 1 1 1 "Sc" 1 "1" 1 """ 1 "«" 1 "»" 


(un num op) 






(num op postfix) 




= "++" 1 "— " 


(num rel) 




_ II 1 II ^11 1 11^=11 1 ■1^=11 1 ii==<i 1 ■■1=11 


(bool expr) 




= (bool const) 
1 (bool var) 

1 (bool expr) (bool op) (bool expr) 

1 (un bool op) (bool expr) 

1 (num expr) (num rel) (num expr) 

1 ite "(" (bool expr) "," (bool expr) "," (bool expr)")" 
1 "num2bool" "(" (num expr) ")" 
1 "(" (bool expr) ")" 


(bool const) : 


= ( "true" 1 "false" ) 


(bool var) : 


(bool var id) 

1 (bool var id) " [" (num expr) "] " 

1 (bool var id) " [" (num expr) "]" "[" (num expr) "]" 


(bool var id) 




"b" ((letter) I (digit))* 


(assign bool op) 




= 1 "&&=" 1 " 1 |=" 1 "--=" 


(bool op) 




= "&&" 1 "11" 1 "--" 


(un bool op) 




= II 1 >■ 



Table 1: EBNF description of URSA language 
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nzpowernk=nz ; 

call power (nxpowernk,nk) ; 
call power (nypowernk,nk) ; 
call power (nzpowernk,nk) ; 

assert_all (nxpowernk+nypowernk==nzpowernk) ; 

Example 4.2. Given nine points in space, no four of which are coplanar, the task is to 
find the minimal natural number n such that for any coloring with red or blue of n edges 
drawn between these nine points there always exists a triangle having all edges of the same 
color 

We will slightly reformulate the problem: it is sufficient to find the maximal natural 
number n such that there is a coloring with red or blue of n edges drawn between these 
nine points such there is no triangle having all edges of the same color. We will assume that 
nE[i] [j] is if there is no edge linking i-th and j-th point, that nE[i] [j] equals 1 if the 
edge linking i-th. and j-th point is red, and that iiE[i] [j] equals 2 if the edge linking i-th 
and j-th point is blue. In the following specification, nNumberOf Edges stores the number of 
edges, bTwoColors stores the condition that for each pair of points, either there is no edge, 
or it is red or blue, and bNoMonochromaticTriangle stores the condition that there is no 
triangle having all edges of the same color. In order to use the maximal n that meets these 
conditions, the command maximize (n, 1 , 36) is used (there are 36 edges at most), 
maximize (n, 1 ,36) ; 
nPoints=9; 
nNumberOf Edges=0 ; 
bTwoColors=true ; 
bNoMonochromaticTriaiigle=true ; 

f or (ni=l ;ni<=nPoints-l ;ni++) 

f or(nj=ni+l;nj<=nPoints;nj++) { 
nNumberOf Edges += sgn(nE[ni] [nj] ) ; 
bTwoColors &&= nE [ni] [nj ] <3 ; 
f or (nk=nj+l ;nk<=nPoints ;nk++) 
bNoMonochromaticTriangle &&= 

(nE[ni] [nj]==0 I I nE [ni] [nj] ! =nE [ni] [nk] I I nE [ni] [nj] ! =nE [nj] [nk] ) ; 

> 

assert (nN\imberOfEdges==n && bTwoColors && bNoMonochromaticTriangle); 



5. Semantics of ursa Language 

The semantics of the URSA language is not equal, but rather parallel to the standard se- 
mantics of imperative programming languages. Namely, while in the standard semantics 
expressions (numerical and Boolean) are always evaluated to ground values, in URSA they 
may be represented in symbolic propositional form. In terms of operational semantics [59], 

^This is one of the problems from International Mathematical Olympiad (IMO) held in 1992. Mathemat- 
ical problems from IMOs are typically very challenging problems from different areas of mathematics, often 
coming from complex mathematical conjectures, but not requiring heavy mathematical devices themselves 
(as they are aimed at high-school students) 120'. A number of other IMO problems can be specified and 
solved using the presented system. 
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in the standard semantics, a store (intuitively, describing memory) is a function from iden- 
tifiers to integers, while for the URSA language, a store is a function from identifiers to 
integers or vectors of propositional formulae. Boolean variables are represented by unary 
vectors, while numerical variables are represented by vectors of length n (where n is chosen 
in advance and then fixed for one session of the interpreter's work). 

A configuration is a pair (c, s) of a command or an expression c and a store s. A one 
step relation i-^ maps configurations to configurations. In the following rule descriptions, 
{skip, s) denotes a terminal configuration — a program that has completed execution, x 
denotes a variable identifier, i, ii, and 12 denote integers, fi, and f" denote propositional 
formulae, e, e', ei and 62 denote expressions, and s 1+1 (x y) denotes a function s such 
that s{x) = y and s{z) = s{z) for z ^ x. 

In the standard semantics, for example, the assignment operator = is defined by the 
rules: 

(1) {x = i,s) I-)- {skip, s \+l {x i-^ i)) 

^2) (e,s) ^ {e',s') 

(x = e, s) {x = e',s') 

Since in the URSA language a variable can be assigned both a ground integer value or 
a vector of propositional formulae, the semantics of the assignment operator in URSA is 
defined by the rules (the rule (1) from above is split into two rules): 

(I'a) {x = i,s) i-> {skip,s l+J (x i-> i)) 

(I'b) {X=[fi,...,fn],s) ^ {skip,S \S {X^[fl,...,fn])) 

/2'\ (e,s) ^ {e',s') 

(x = e,s) I— {x = e',s') 

If, in an assignment command, the right hand side is evaluated to a ground integer 
(i.e., if the rule (I'a) has been applied), then the variable on the left hand side gets the 
status ground. If the right hand side is evaluated to a vector of formulae (i.e., if the rule 
(I'b) has been applied), then the variable on the left hand side gets the status symbolic and 
dependent. 

In the standard semantics, for an operator o over integers, the relation 1-^ is defined for 
expressions by the following four rules: 

(3) {x,s) ^ {s{x),s) 

(4) {ii 012,3) I—)- {i,s) (where i equals ii 0^2) 
(ei,s) ^ {e'i,s') 



(5) 
(6) 



(ei oe2,s) ^ (e'l oe2,s') 
(62, s) ^ {e2,s') 



{ioe2,s) 1-^ {ioe'2,s') 

The corresponding rules for the URSA language are substantially different. Regarding 
the rule (3), in standard imperative programming languages, a statement attempting to 
evaluate an undefined variable (one that is not in the domain of the current store) results 
in a runtime error. On the other hand, in the URSA language, undefined variables can be 
used in expressions, so this introduces a non- standard rule (hrstead of the rule (3)B 



More precisely, rather than a rule, this is a rule-schema, with instances for each n. 
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(3') {x,s) ^ 



(s(x), s) if X is defined in s 

{[vi, . . . ,Vn], s [+J [ui, . . . , otherwise (uj's are fresh 

prepositional variables) 



If, for a variable x, the second case of the above rule has been applied, x gets the status 
symbolic and independent (and is represented by a vector of propositional formulae). In 
other words, this status is associated to each variable that is for the first time used not on 
the left hand side of an assignment operator. For example, if a variable nX was not defined 
beforehand, after the statement nY=nX;, both variables are to be represented by the same 
sequence of propositional variables, but nX will internally have the status independent, while 
nY will have the status dependent. Variables with the status symbolic and independent are 
those that should be determined in order to solve the given problem. 

Example 5.1. Let us consider the specification for the trivial problem from Example 13.11 

nv = nu+ 1 ; 
assert (nv==2) ; 

The variable nu is used for the first time within the first command, so it is symbolic 
(internally represented by a vector of propositional formulae) and independent. The variable 
nv becomes symbolic and dependent (it depends on nu). 

Since in URSA expressions (numerical and Boolean) may be represented in symbolic 
propositional form, the rule (4) is replaced by the following four rules: 

(4'a) {ii 0^2, s) I— >■ {i,s) (where i equals ii 0*2) 



where, in the last rule, [/f , . . . , /^'J is a vector of propositional formulae corresponding to 
[f I, fn] o [f [,..., fn], assuming that all numerical expressions are considered unsignecfl 
and all arithmetic operations are performed modulo 2". For example, assuming that num- 
bers are represented by vectors of length 2 (i.e., n = 2), the rule (4'd) for the operator + is 
as follows: 



As another example, assuming that numbers are represented by vectors of length 2 (i.e., 
n = 2), the operator > is defined as follows (in accordance with lexicographic ordering): 



By the above semantics, an expression involving symbolic expressions is also a symbolic 
expression. However, for efficiency reasons, if it is possible, an expression will be computed 
to a ground value. For instance, after the statement nA = nB * ; , the variable nA will 
have a value and after the statement bX = bY && false;, the variable bX will have the 

^ Dealing with representation of signed integers or floating point numbers can be, in principle, 
described and implemented by analogy (http: //www. inf ormatik.uni-bremen.de/~f lorian/sonolar/' 
|http: //www. cprover.org/SMT-LIB-Float/ [T3l[7]). StiU, as discussed by Brillout et al. [7], the bottleneck 
of a reducing floating point operations to propositional logic would be in the complexity of the resulting 
propositional formulae. 




{[fij2] + [f[J^],s) ^ ([(/ie/0e(/2A/^),/2e/^],5: 



{[flj2]>[fi,f^],s) ^ ([(/lA-/()V((/i^/()A(/2A-/^)],s) 
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(4"d) {[h,...,fn]o[f[,...,fk],s) ^ I 



ground value false, even if the variables bB and bY were symbolic. This is described by the 
modified version of the rule (4'd): 

{i, s) if all fi, . . ., are Boolean 

constants, whore i is a number 
with the binary representation 

[/") • • • ; fn\ 

([/('>• ••>/«] otherwise 

While the rule (5) from above is kept unchanged for the URSA language, the rule (6) is 
split into two rules, one for ground argument and one for symbolic argument: 

(ei,s) ^ (ei,gO 



(5') 
(6'a) 
(6'b) 



(eioe2,s) (eiOe2,s') 

(62, g) ^ (e!^,s') 

(i O 62 , .s) ^ {io 62 , s') 

(62, S) (62,5') 



(7) 
(8) 



([/l,---,/n]^e2,s) ^ {[fi,...,fn]oe'2,s') 

The semantics of while is defined by the following standard rules: 

{b,s) ^ {T,s') 

(while b do c, s) (c; while b do c, s') 

{b,s) ^ {±,s') 



(while b do c, s) i->- {skip, s') 

If, for the statement while b do c, neither of {b, s) 1— >■ T and {b, s) _L holds 
(i.e., if b is evaluated to a propositional formulae), then a run-time error is raised. The 
semantics of for is defined by analogy. This is a restriction of the specification language 
and it is difficult to overcome, as it would require indefinite loop unrolling. Recursion is 
also not supported and an index for accessing an array element has to be a ground number. 
A condition for the if statement has to evaluate to a ground Boolean value, too (otherwise 
a run-time error is raised). However, as a substitute, there is a conditional statement ite 
(for if-then-else) and its condition (the first argument) can be either ground or symbolic. 
Its semantics, if the last two arguments arc of the Boolean type, is defined in the following 
way (if the arguments are of the numerical type, it is defined for each vector element by 
analogy) : 

{ite{b,[bi],[b2],s) ^ ((6^61) A (-6^62), s) 

Arguments to procedures are passed by name if they are variables, and otherwise, they 
are passed by value. 

The semantics of other commands in the URSA language and its relationship with the 
standard semantics is obvious or analogous to the semantics of the commands given above. 

The above, clear and relatively simple semantics, enable a rather straightforward (al- 
though still tedious) verification (of correctness) of the proposed system: if the system 
returns a solution, then this solution indeed meets the given specification, and — if the 
system does not return a solution, then the specification has no solutions. The correctness 
property is given by the following theorem (sg denotes a function not defined for any argu- 
ment and I— > correspond both to the standard and the URSA semantics since there are no 
unknowns in the relevant specification). 
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Parsing and interpreting URSA statements 
I 

Generating a goal formula from the constraint 

I 

Transforming the goal formula to CNF 

i 

Invoking the SAT solver 
I 

Turning SAT models into values of the independent variables 



Figure 1: Overall architecture of URSA system 

Theorem 5.2. If the variables vi, V2, . . ., Vn are (the only) unknowns in an URSA specifica- 
tion S; assert{b);, then it leads to a solution {vi,V2, ■ ■ ■ , Vn) = (ci, C2, . . . , c„), (i.e. this solu- 
tion leads to the model of the formula s{b) ) iff [vi = ci;v2 = C2; . . . ;Vn = Cn] S; assert(b), s$) 
1-^ {skip,s) where s{b) is true. □ 

The above semantics also ensure faithfulness of problem specifications — each model 
meeting the specification leads to one model of the generated propositional formula, and each 
model of the generated propositional formula leads to one model of the input specification. 
This property is essential for constructing (or counting) all solutions of a problem. 

6. Interpreter for ursa Language 

In this section we describe the implementation of the interpreter for the URSA language!! 

The interpreter is implemented in the programming language C++. The whole system 
has a simple architecture and is relatively small (around 100Kb of source code, not counting 
SAT solver used) . The overall architecture of the system is illustrated in Figure [TJ 

6.1. Parsing and Interpreting URSA Statements. Statements can be entered in an 
interactive mode and processed one by one or provided to the URSA system within a file. 
Statements are interpreted according to the semantics described in Section [5l 

The table of symbols (i.e., the store) stores values of numerical and Boolean variables. 
Numerical variables are represented by a class (Number) whose objects can be either sym- 
bolic numbers (i.e., vectors of propositional formulae) or concrete numbers. For this class all 
standard arithmetic (except divisiorH), bitwise logical, and relational operators are defined. 
In principle, these operators return ground values for ground operands and vectors of for- 
mulae for symbolic operands (but, if possible, return ground values for symbolic operands). 
Boolean variables are represented by analogy (by the class Boolean). 

^The source code of the interpreter and example specifications are available within the distribution of the 
URSA tool, available online from: |http : //www.matf .bg. ac.rs/- janicic/ursa.zip[ 

^It is possible to define division in symbohc terms, but for its complexity, it is not implemented in 
the current version of the system. Still, division can be modelled within the system by using multipli- 
cation as shown in the following example, giving nD as a results of integer division nX/nY and nR as a 
remainder (the constraint on nD restricts the number of solutions, because of the modular arithmetic): 
assert (iiX==nY*nD+nR;nR<nY;nD<nX) ; 
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Symbolic values of variables are represented by vectors of propositional formulae. There 
are classes for dealing with propositional formulae (Formula) and vectors of propositional 
formulae (FormulaVector). The length of vectors of propositional formulae representing 
numerical variables is given as a parameter to the interpreter (the default value is 8, corre- 
sponding to integers from to 255). 

In the current implementation, all numerical variables and constants are treated as 
unsigned integers (represented by binary representation). For the class FormulaVector all 
standard arithmetic (except division), bitwise logical, and relational operators are defined. 
They deal only with symbolic numerical and Boolean values (in contrast to the class Number 
and Boolean). The implementations of bitwise logical operators are simple and straight- 
forward. The implementations of the arithmetic operators modulo 2" and the relational 
operators are more complex because resulting bits depend on all the previous bits of the 
operands. Figure [2] shows the implementation of the relational operator > (it processes 
propositional formulae corresponding to bits from the least significant one and returns a 
unary vector). Note that the given function is not computing a value for the relational op- 
erator > for concrete numbers, but it sets a formula that corresponds to > for two symbolic 
input numerical expressions. 

FormulaVectorl FormulaVector :: operator > (const FormulaVector &fv) { 
FormulaVectorl result, rl,r2; 
rl = (*this) [size-1] ; 
r2 = fv [size-1] ; 
result = (rl & ~r2) ; 
forCint i=size-2; i>=0; i — ) { 

rl = (*this) [i] ; 

r2 = fv[i] ; 

result = (rl & ~r2) I (result ft (rl==r2)); 

} 

return result; 

> 

Figure 2: Implementation of the operator >. 

For the sake of efficiency (both time and space), the technique of shared expressions 
was used. So, each subformula is stored only once, but there can be more references to it 
(from different formulae). All formulae generated during the interpretation of the program 
are stored by the class FormulaFactory in this way. Hence, the formulae are not stored 
individually, but in a form of a directed graph that stores links between them. Each formula 
that is not a propositional variable is represented by its connective and by references to its 
subformulae. Each formula is assigned a unique (numerical) identifier. 

Example 6.1. Figure [3] illustrates how the (artificial) formula {p /\{q /\ r)) V {{q A r) A ^p) 
and its subformulae are stored internally. 

6.2. Generating a Goal Formula from the Constraint. When executing the command 
assert, the interpreter invokes the underlying solving process. For the given constraint 
(given as an argument to assert), a corresponding (single) propositional formula is gener- 
ated, transformed to CNF and, by a SAT solver, it is checked whether the formula is true 
in some valuation. If yes, then the system lists values of all independent variables in that 
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Figure 3: Internal representation of formulae 



valuation. The command assert_all is similar, but it lists values of independent variables 
in all satisfying valuations. 

Transforming the formula F that corresponds to the constraint to CNF is performed 
using Tseitin transformation |70] which is linear in both space and time. The central idea 
of the transformation is to introduce new (,, definitional") variables for all subformulae of 
the input formula, to replace the subformulae with the corresponding variables, and to add 
conjuncts (in the form of clauses) representing ,, definitions" of newly introduced variables. 
This yields a resulting formula that is not logically equivalent to F (because of the newly 
introduced variables) but is equisatisfiable to it (i.e., the resulting formula is satisfiable if 
and only if F is satisfiable). Still, each model for the resulting formula gives one model for 
F and vice versa (thanks to the nature of definitional variables). This approach is suitable 
for the representation of formulae described in Section 16.11 since all subformulae of the 
formula representing the constraint are already generated and assigned unique identifiers 
that correspond to definitional variables. 

The problem with the Tseitin transformation is that it introduces many new variables, 
and consequently many clauses. There are techniques that can reduce the number of vari- 
ables and clauses, e.g., by using implications instead of equivalences for subformulae that 
occur in one polarity only [23]. In the current implementation of the URSA system, for re- 
ducing (in some cases) the number of variables and clauses, associativity and commutativity 
of the connectives A and V is usedQ If a formula A is of the form A A . . . A A„ there 
is no need to introduce new variables for each of n — 1 conjunctions. Indeed, the standard 
Tseitin set of resulting clauses for this case can be replaced by: 

{pA V ^pAi V -^PA2 V ... V -^PaJ a {^Pa VpAi) a {^PA V PA2) A . . . {^PA V PaJ 

Disjunctions are treated by analogy. In addition, if the initial formula is a conjunction, its 
conjuncts are put into CNF separately and conjoined. If any of those conjuncts is a clause, it 
is directly used (without transformation and definitional variables) at the conjunction top 
level [32]. The above modifications still keep the transformation linear. 

^ The fact that the connectives A and V are associative and commutative is used only when the formula 
is already generated. An alternative would be to deal with n-ary conjunctions and disjunctions in earlier 
stages, along the generation process. 
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Thanks to the above modifications, input instances of the SAT problem itself (repre- 
sented simply, in a way that is close to the dimacs format) are reduced by the URSA system 
to the same instances, up to renaming variables 

Typically, reducing the size of a formula is an important objective, but a smaller num- 
ber of clauses does not necessarily mean an easier formula and sometimes adding the right 
clauses (e.g., those corresponding to symmetry breaking) can have a positive effect to per- 
formance of the solver. Still, guessing and adding clauses that could speed-up the solving 
process is is not performed in the current version of URSA. 

6.3. Invoking the SAT Solver. The URSA system (currently) uses state-of-the-art SAT 
solvers ArgoSAtQ and claspIE] The solvers are called directly through appropriate func- 
tion calls. Alternatively, the solvers could accept inputs through files in the standard dimacs 
forn£] (ursa can be used just to translate the input problem to SAT, without solving it). 
In that case, the solvers can be used as a black-box, and could be replaced by any SAT 
solver that accepts this input format (and has an option to generate all models of the input 
propositional formula). 

Since during the transformation to CNF (described above) some variables are eliminated, 
it is necessary to rename all remaining variables (and update the numbers of variables). 
Namely, if some variables do not occur in the generated CNF, the solver would consider as 
different all models that differ only in the values of such variables. The basic, independent 
variables (those that correspond to independent URSA variables) are never eliminated, even 
if they don't occur in the generated formula (because their values in all models are relevant). 

Along with the generated SAT instance, URSA can also pass to the solver the information 
on which propositional variables are independent and which are dependent. Some SAT 
solvers can use this information to guide (and hopefully speed-up) the solving process. If 
the constraint is given by the assert_all command, the solver is invoked to generate all 
satisfying valuations0 When multiple invocations of the solver are used (for finding all 
solutions), the solving process can benefit from clauses learnt in previous invocations. 

If there is a statement minimize (or maximize) used, in the current implementation, 
the problem is solved sequentially for all values from the given range assigned to a relevant 

Because of the possible renaming of the variables, solving the same SAT instance by URSA and by the 
underlying solver would not necessarily take the same time as the solving process can take different routes. 

^ArgoSAT ( |http: //cLTgo .matf .bg. ac .rs/downloads .html| ) is an open-source, flexible, and verified SAT 
solver 

^•^CLASP | |http : //www ■ cs . uni-potsdam . de/ claspT^ ) is a solver for (extended) normal logic programs [27 . 
ft combines high-level modelling capacities of ASP with state-of-the-art techniques from the area of SAT 
solving and it can be used as an ASP solver or a SAT solver. CLASP was a winner at the ASP Competition 
2009 in several categories (http://www.cs.kuleuven.be/-dtai/events/ASP-competition/Results.shtml) 
and a winner at the SAT Competition 2009 (http: //www, sat competition, org/} in categories crafted SAT 
-l-UNSAT and Crafted SAT. 

^'^In DIMACS form, the number of variables A'^ and the number of clauses L are given first, and are followed 
by the list of clauses. The variables are represented by natural numbers (from 1 to N) and their negations 
are represented by corresponding negative numbers (from —1 to —N). More details about the DIMACS format 
can be found online: http : //www. satlib. org/Benchmarks/SAT/satf ormat .ps' 

^■^If the solver does not have a feature for generating all satisfying valuations, a simple approach with 
blocking clauses and successive calls to the solver can be used (a blocking clause, added to the set of initial 
clauses, consists of all literals defining one model negated — this way, a model once generated will not be 
repeated. 
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variable — from the minimal (maximal) element in the range onwards, seeking for a minimal 
(maximal) value that meets the constraint. 

6.4. Turning SAT Models into Values of the Independent Variables. If the SAT 

solver finds a satisfying valuation for the input formula, that valuation is used for comput- 
ing values of independent URSA variables. For this, only the basic variables are relevant 
(and not the definitional variables introduced during transformation to CNf). Each satisfy- 
ing valuation determines a vector of fixed Boolean values that correspond to an independent 
variable. The numerical (and Boolean) values are trivially computed from such representa- 
tions and returned by the URSA system. 

7. Pragmatics of ursa Language 

Representation of symbolic values natively used in URSA corresponds to binary represen- 
tation of unsigned integers, but the specification language is expressive enough and leaves 
enough freedom for modelling problems in substantially different ways and in different 
encoding styles, leading to simpler or more efficient solutions. For illustrating this, a pro- 
totypical CSP example — the queens problem is considered. The problem is to place N 
chess queens on an x chessboard such that none of them is able to capture any other 
(following the standard chess queen's moves). For modelling this problem, in the following 
text we do not use symmetry breaking or other similar additional constraints, but only the 
basic formulations of the problem. 

Since each row (denoted by numbers to A?^ — 1) of the board can have exactly one 
queen on it, the problem is, simply reformulated, to determine, for each row, a column for 
one queen to be placed. Hence, a solution to the problem is a sequence rg, ri, . . ., rjv_i 
that meet the given constraints. If such sequence is given in advance, it can be simply 
checked whether it is indeed a solution of the problem. The next URSA program, followed 
by a fragment of the output, specifies the problem (for the problem size 8; in this and other 
URSA specifications that follow, the dimension of an instance can be trivially changed in 
one line). 

***** queens-l ***** */ 
nDim=8 ; 

bDomain = true; 
bNoCapture = true; 

for(ni=0; nKnDim; ni++) { 
bDomain &&= (n [ni] <nDiin) ; 
for(nj=ni+l; nj<nDim; nj++) 

bNoCapture &&= n[ni] !=n[nj] && ni+n[nj] !=nj+n[ni] && ni+n[ni] !=nj+n[nj] ; 

> 

assert _all (bDomain && bNoCapture); 

********* URSA Interpreter ********* 
************************************* 

— > Solution 1 
n[0] =0 
n[l]=6 



18 



P. JANICIC 



n [2] =3 
n[3]=5 
n[4]=7 
n[5]=l 
n [6] =4 
n[7]=2 



— > Solution 92 

n [0] =3 

n[l]=l 

n[2]=4 

n[3]=7 

ii[4]=5 

n[5] =0 

n[6] =2 

n[7]=6 

[Formula generation: Os; conversion to CNF: 0.01s; total: 0.01s] 
[Solving time: 0.08s] 

[Formula size: 841 variables, 3352 clauses] 

In the above specification (referred to as to queens-1 in tfie following text), i-th row 
(i = 0, . . . , iV — 1) is associated with the i-th element the array n. In each row there should 
be one queen and n[i] is equal to the column in which that queen is placed. The variable 
bDomain encodes the condition that each n[i] is between and — 1 and the variable 
bNoCapture encodes the condition that there are no two queens that attach each other. 
If the numbers are represented by vectors of length I, for each problem instance there are 
IN basic variables in the propositional formula generated. For < 16, the numbers (and 
intermediate results such as ni+n[nj]) in the above specification can be represented by 5 
bits. Table [2] shows experimental results for the above specification for the instance sizes 
= 1, 2, 3, . . ., 12 (instances for which all solutions were found within 600s), including 
the number of solutions, the number of variables and clauses in the generated propositional 
formula, and the time speniF^ for finding all solutions. The time spent for finding the first 
solution was less than 0.05s for each problem instance. The time spent for generating the 
formulae was less than 0.01s for each problem instance and was negligible compared to the 
solving time. This shows that the used generation mechanism is rather efficient. For these 
problems (as well as for many other CSP problems), the ratio of the number of clauses and 
the number of variables in the generated formulae, gets rather stable (as the size of the 
instance increases) and it reflects the problem constrainedness |29j . 



dimension 


1 


2 


3 


4 


5 


6 


7 


8 


9 


10 


11 


12 


number of solutions 


1 








2 


10 


4 


40 


92 


352 


724 


2680 


14200 


number of variables 


5 


44 


115 


209 


331 


480 


667 


841 


1052 


1286 


1560 


1819 


number of clauses 


5 


149 


418 


794 


1274 


1869 


2612 


3352 


4217 


5179 


6295 


7390 


all solutions 














0.01 


0.02 


0.14 


0.08 


0.61 


3.12 


19.25 


116.78 



Table 2: Experimental results for the queens-1 specification applied for = 1, . . . , 12 



Reported experimental results were obtained on a PC computer with Intel Celeron 420 1.60Ghz, 1Gb 
RAM, for URSA using clasp as the underlying SAT solver. 
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The above data are obtained using numbers represented by vectors of length 5. However, 
using the minimal vectors length is not critical — if a larger vector size is used, only trivial 
constraints are added and they have a very small impact on the solving process. Table 
[3] shows results for the above specification of the queens problem for instance size 10, for 
vectors length increasing from 5 to 12. The time for generating formulae slightly increases, 
the number of variables and clauses increases significantly, but the time spent for solving 
remains about the same for all vector lengths. 



number of bits 


5 


6 


7 


8 


9 


10 


11 


12 


variables 


1286 


1611 


1936 


2261 


2586 


2911 


3236 


3561 


clauses 


5179 


6604 


8119 


9724 


11419 


13204 


15079 


17044 


generating 


0.02 


0.02 


0.02 


0.03 


0.03 


0.04 


0.04 


0.04 


solving 


3.12 


3.26 


3.38 


3.19 


3.37 


3.13 


3.22 


3.23 



Table 3: Experimental results for the queens-1 specification applied (for finding all solu- 
tion) for A'^ = 10 and for number representations using 5, 6, . . . , 12 propositional 
formulae 

The same problem can be represented in URSA in other ways as well. For instance, 
the above specification can be slightly modified so it uses bit-wise operators, instead of 
arithmetic operators. Each row of the table can be represented as a A'^-digits number, i.e., 
by an element of an array n. Each of these numbers should have exactly one 1 in its binary 
representations (which can be checked in the way described in Example 18. ip . and at each 
position exactly one of ii[i] should have 1. The remaining (diagonal) no-attack conditions 
(for pairs of positions represented by indexes (Ax,Ay) and (Bx,By)) can be also expressed by 
using bit-wise operations. A corresponding URSA specification (referred to as to queens-2 
in the following text) is as follows (it should be used with numbers represented as vectors 
of length A^): 

/* ***** queens-2 ***** */ 
nDim = 8; 

bHorizontal = true; 
for(ni=0; ni<nDim; ni++) 

bHorizontal kk= ((n[ni] & n[ni] -1)==0) kk (n[ni]!=0); 

nVertical = 0; 
for(ni=0; ni<nDim; ni++) 

nVertical |= n[ni] ; 
bVertical = (nVertical+1 == 0) ; 

bDiagonal = true ; 
for(nAi=0; nAi<nDim-l; nAi++) 
for(nAj=0; nAj<nDim; nAj++) 

for(nBi=nAi+l; nBi<nDim; nBi++) 
for(nBj=0; nBj<nDim; nBj++) 

if (nBi-nAi==nBj-nAj I I nBi-nAi==nAj-nBj ) 

bDiagonal kk= ( ( (n[nBj] «(nBi-nAi) ) k n[nAj])==0); 

assert_all (bHorizontal kk bVertical kk bDiagonal) ; 

The queens problem can be represented also in the spirit of the direct encoding. Namely, 
each position in the table is associated with one Boolean variable, so b[i] [j] is set if and 



20 



P. JANICIC 



only if the position (i, j) is occupied by a queen. It has to be ensured that in each row there 
is exactly one queen and it has to be ensured that in each column there is at least one queen 
(this is sufficient if the former condition is satisfied). No-attack conditions arc expressed in 
a straightforward manner. A corresponding URSA specification is as follows (the numerical 
values could be represented by any vector length that can accommodate nDim): 
/* ***** queens-3 ***** */ 
nDim = 8; 

bHorizontal = true; 
for(ni=0; ni<nDim; ni++) { 
bOne = false; 
bMoreThanOne = false; 
for(nj=0; nj<rLDim; rLj++) { 

bMoreThanOne I |= bOne && b[ni] [nj] ; 
bOne I 1= b[ni] [nj] ; 

> 

bHorizontal &&= bOne && ! bMoreThanOne ; 

} 

bVertical = true; 
for(ni=0; ni<nDim; ni++) { 

bOne = false; 

for(nj=0; nj<nDim; nj++) 
bOne 11= b[nj] [ni] ; 

bVertical &&= bOne; 

} 

bDi agonal = true; 
for(nAi=0; nAi<nDim-l; nAi++) 
for(nAj=0; nAj<nDim; nAj++) 

for(nBi=nAi+l; nBi<nDim; nBi++) 
for(nBj=0; nBj<nDim; nBj++) 

if (nBi-nAi==nB j -nA j I I nBi-nAi==nA j -nB j ) 

bDiagonal &&= ( ! b [nAi] [nA j ] II ! b [nBi] [nB j ] ) ; 

assert_all (bHorizontal && bVertical && bDiagonal); 

It seems that the four-fold loop in the above specification can be a source of inefficiency. 
Indeed, instead of going through all possible values for nBi and only then checking if the 
corresponding positions should be tested for attack condition, one can calculate and consider 
only relevant coordinates nB j , as in the following modified specification: 
/* ***** queens-4 ***** */ 
nDim = 8; 

bHorizontal = true; 
for(nx=0; nx<nDim; nx++) { 
bOne = false; 
bMoreThanOne = false; 
for(ny=0; ny<nDim; ny++) { 

bMoreThanOne I I = bOne && b [nx] [ny] ; 
bOne II = b [nx] [ny] ; 

} 

bHorizontal &&= bOne && ! bMoreThanOne ; 

} 
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bVertical = true; 
for(ny=0; ny<nDim; ny++) {. 

bOne = false; 

for(nx=0; nx<nDim; nx++) 
bOne I I = b [nx] [ny] ; 

bVertical &&= bOne; 

} 

bDiagonal = true; 

for(nAx=0; nAx<nDim-l ; nAx++) 

f or(nBx=nAx+l; nBx<nDim; nBx++) { 

for(nAy=0; nBx-nAx+nAy<nDim; nAy++) { 
nBy=nBx-nAx+nAy ; 

bDiagonal &&= ( ! b [nAx] [nAy] I I ! b [nBx] [nBy] ) ; 

y 

f or (nAy=nBx-nAx; nAy<nDim; nAy++) { 
nBy=nAy- (nBx-nAx) ; 

bDiagonal &&= ( ! b [nAx] [nAy] I I ! b [nBx] [nBy] ) ; 

} 

> 

assert_all(bHorizontal && bVertical && bDiagonal); 

However, this modification does not improve efficiency of the solving process. Namely, 
aU constraints that are generated by the former and not by the later specification are 
trivially discarded and the final generated formulae are identical in the two cases. The 
formula generation is, still, somewhat more efficient in the latter case, but that gain gets 
insignificant as the problem instance grow. 

Table U] shows experimental results for the four given representations of the queens 
problem. For the first representation numbers are represented by vectors of length 5, for 
the second, numbers are represented by vectors of length equal to the instance size, and 
for the last two, numbers are represented by vectors of length 4. The table shows times 
(in seconds) for finding all solutions, while the timeout was set at 600s. The ffi'st problem 
representation was clearly shown to be the least efficient, while modified version (the second 
representation) was the best. As expected, the third and the fourth specification were almost 
equal. According to the data, it appears that the second specification was most efficient. 
However, dealing with output of large number of variables takes a larger portion of time 
for the last two specifications than for the second one. If used in a "quiet" mode (without 
listing values of independent variables) , the last two specifications are slightly more efficient 
than the second one. In summary, the specifications based on the bitvector operators and 
on the direct encoding were more efficient. Of course, even a conclusion about the most 
efficient sort of encoding for the queens problem should rely on a wider set of specifications 
and experiments, not to mention considering most efficient sorts of encodings generally. 
Although some encodings typically perform better than some other ones, a full picture of 
relative quality of encodings is very complex and each popular encoding has problems where 
it performs well. Hence, one of the points of presenting different specifications of the queens 
problem is not to locate the best encodings, but rather to demonstrate that in the URSA 
system one can make significantly different representations of the same problem. 
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Dimension 


solutions 


queens- 1 


queens- 2 


queens- 3 


queens-4 


8 


92 


0.08 


0.08 


0.09 


0.03 


9 


352 


0.61 


0.16 


0.15 


0.09 


10 


724 


3.12 


0.37 


0.32 


0.27 


11 


2680 


19.25 


1.18 


1.17 


0.92 


12 


14200 


116.78 


4.92 


5.52 


5.15 


13 


73712 




26.47 


30.85 


30.48 


14 


365596 




164.19 


198.06 


190.45 



Table 4: Experimental results of URSA applied on the queens problem for N = 1, . . . , 14 
(for finding all solutions) and for four different specifications 

8. Comparison to Related Techniques, Languages and Tools 

In this section we discuss tools and techniques related to the presented approach. We 
comment on symbolic execution, on constraint solvers, and on reducing problems to SAT. 
Finally, we present results of a limited experimental comparison between URSA and several 
other systems. 

8.1. Symbolic execution. Operation of URSA is related to symbolic execution. In sym- 
bolic execution [40l[57], program inputs are represented by symbolic values rather than by 
concrete data and the values of program variables are represented as symbolic expressions. 
The program is executed by manipulating expressions involving the symbolic values and, 
as a result, the output values are expressed as a function of the input symbolic values. 
Symbolic execution has been proposed for software verification over three decades ago, and 
recently it gained a renewed interest. The verification tools using symbolic execution in- 
clude systems like Java PathfindeS [56], Pefi [68j, VigilantfEl [16jJ. Some of these tools use 
SAT and SMT solvers, but they typically handle only machine data-types (and not arbitrary 
bit-widths). Also, their purpose is generating test suites and finding (single) models that 
lead to bugs (rather than enumerating all solutions of combinatorial problems). 

8.2. Modelling Languages and Systems. General modelling systems are used for spec- 
ifying problems in corresponding modelling languages and solving them by various tech- 
niques. There are several dominating approaches for constraint programming including: 
constraint logic programming over finite domains (clp(fd); combines two declarative pro- 
gramming paradigms - logic programming and constraint solving) [37J , answer set program- 
ming (asP; a form of declarative programming with the roots in nonmonotonic reasoning, 
deductive databases and logic programming with some ASP systems using SAT solvers) [30J , 
and disjunctive logic programming [42j. There are hybrid systems that use custom spec- 
ification languages and provide support for constraint programming (e.g., IBM ILOG OPL, 
COMET, g12). Also, there are libraries for constraint programmingand combinatorial op- 
timisation for general purpose programming languages: llog solveo is a C-|— |- library for 

-1 A r , 

http: //babelfish. arc .nasa.gov/trac/jpf 
^"http : //research. microsoft . com/en-us/pro jects/pex/ 
^'http : //research. microsoft . com/en-Tis/projects/vigilaiite/| 
^' |http : / / ilog. com/products/ 1 
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constraint programming, Numberjac is a Python-based constraint satisfaction and op- 
timisation library with support for several underlying combinatorial solvers, Scala^^ is a 
SCALA library for checking satisfiability and solution enumeration with support for the SMT 
solver Z3 [41j. 

Programs in specification languages used by the modelling systems are generally not 
directly executed. Rather, they describe a problem at a high-level, descriptive way and the 
specification does not say how the problem is to be solved. In most of modelling languages 
all solving aspects are ignored (are stored only in the underlying solver) . Modelling systems 
typically use custom, different and incompatible modelling languages. There is no standard 
modelling language for constraint programming problems. A language XCSP 2.1 is an XML- 
based format to represent various CSP instances [61j. The main objective of this language is 
to ease the effort required to test and compare different algorithms by providing a common 
test-bed of constraint satisfaction instances. xcSP is already used in CSP competitions 
as an standard input format^B A high-lever language MiniZinc (a subset of a language 
Zinc [47J) also aims at becoming a standard specification language [54j- MiniZinc models 
can be translated to FlatZinc, a low-level solver input language. There are a number 
of differences between the languages xcSP, MiniZinc and URSA. xcSP representations are 
low-level, while MiniZinc and URSA representations are high-level — the former is rather 
a machine-oriented, interchange format, while the latter two provide high-level, human- 
readable specifications. In contrast to MiniZinc and URSA, xcSP has no arrays and looping 
constructs and in XCSP, for each instance, domains, variables, relations, predicates and 
constraints are exhaustively listed. 

All specification languages used in the above systems are based on some form of declara- 
tive/logic programming, while URSA uses a novel combination of declarative and imperative 
paradigms. For some problems, URSA specifications may be longer than of other systems, 
but some problems naturally expressed in URSA (for instance, problems that involve bit-wise 
operations, arithmetic modulo 2", or software verification problems that involve destructive 
assignments) are very difficult (or practically impossible for large scale instances) to express 
in other specification languages. Loop constructs are naturally expressed in URSA, while 
their absence in some declarative languages (some specification languages do admit complex 
looping structures) may cause a range of difficulties [62] 1^ The URSA language is expressive 
enough to enable substantially different encodings of the same problem, which is often not 
easy with other systems. Learning the URSA language should be trivial to someone familiar 
with some widely used imperative programming language such as C or Java, as there are 
no specific commands or fiow-controls aimed at constraint solving. Concerning the solving 
mechanism, URSA, like other systems, builds up a formula representing a constraint and then 
posts the constraint to an underlying solver. URSA's specific is that building this formula 
is completely specified by URSA's simple semantics, corresponding to binary representation 
of numbers. Concerning underlying solvers, URSA uses SAT solvers and differs from the 
above described tools that use mathematical programming, constraint logic programming, 
and other techniques (with exception of ASP systems such as Cmodels that also uses SAT 
solvers) . 

^ Ihttp ://4cll0.ucc. ie/number j ack/ 1 

^^http : //cpai .ucc . ie/09/ 
9(1 

There are recent ideas for introducing loops in constraint programming in an imperative language style 
in order to enable prototyping new constraints with less effort [19] . 
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Example 18.11 illustrates one family of problems that can be simply solved by the URSA 
system. Specifications of functions in C (analyzed if they are equivalent) can be almost in 
verbatim used within URSA, while it would be extremely difficult to represent this kind of 
problems in existing declarative programming languages — consider, for instance, complex 
functions (say, like cryptographic functions) that involve a large number of destructive 
assignments in their specifications. Additional problem is that the bitwise operators are 
not supported in other modelling systems. 

Example 8.1. One of the common problems in software verification is ensuring that two 
implementations are equivalent. The URSA system can be suitable for such tasks. Consider, 
for example, two implementations (both based on C-style bitwise operators) of the check 
that the input number n has in its representation exactly one bit set: the first (encoded by 
bl) is rather compact, while the second (encoded by b2) is more elaborated but simpler. 
URSA (for the vector length for representation of numbers equal nLen), verifies that the 
condition bl~~b2 is unsatisfiable, hence the two implementations give the same result for 
any input number, 
bl = (nv!=0) && ((nv & (nv-l))==0); 

iiLen=8 ; 
bOne = false; 
bMoreThanOne = false; 
for(ni=0; nKnLen; ni++) ■[ 

bMoreThanOne ||= bOne M ((nv & 1)!=0); 

bOne 11= ((nv k 1) !=0) ; 

nv »= 1; 



b2 = bOne && ! bMoreThanOne ; 
assert (bl"b2) ; 

8.3. Reduction to SAT and SAT-Based Constraint Solvers. There are a huge number 
of problems solved by reduction to SAT, in a range of domains (e.g., in scheduling |74j . 




termination analysis [25j, cryptanalysis [48\ I50j . model checking [12], to name just a few). 
One of the reasons for such a wide application range are tremendous advances in SAT 
technology over the last years and a number of efficient solvers. The URSA system does not 
introduce or promote one SAT solver. Rather, URSA can use any SAT solver in the solving 
phase. Moreover, it is fruitful to have, within URSA, a number of different SAT solvers, 
appropriate for different sorts of input problems, following ideas of SAT portfolio solvers 



There are several approaches for encoding CSP problems into SAT |60] . Probably the 
most popular basic types of encodings into SAT are: the sparse encoding, the compact en- 
coding, and the order encoding [34^ 166] , In the sparse encoding, a propositional variable 
Xy^i is defined as true iff the integer variable v has a value i assigned to it. Examples of 
the sparse encoding are the direct encoding and the support encoding |60j . In the compact 
encoding (or log encoding), a propositional variable is assigned to each bit of each integer 
variable (within a finite domain) [60j. In the order encoding (also known as regular encod- 
ing), related to many- valued logics and often used for the finite-domain linear problems, an 
inequality v < i is encoded by a different propositional variable for each integer variable v 
and integer value i [661 IT]. Even within one encoding style, modelling of a problem can take 
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significantly different forms [63] ■ There are a number of both theoretical and practical stud- 

ies and comparisons between different encoding schemes. Since the log encoding lacks the 
propagation power of the direct and support encodings, it typically leads to less efficient 
solutions, compared to these two [72l [28]. The order encoding gives better performance 
compared with the direct encoding and the support encoding for some CSP problems [66] . 
Within the URSA system, numerical variables are represented using binary representation 
that corresponds to the compact encoding, but still, thanks to the expressiveness of the 
specification language, other encoding styles can be simulated and used (as shown by the 
presented examples). 

The URSA system is related to the special-purpose system for transforming cryptanal- 
ysis of hash functions into the SAT problem [39j. In that approach, implementations of 
hash functions in C-|--|- were used and, by overloading the standard arithmetic and logical 
operators in C-|— |- and by running the code of the hash functions within such framework, 
propositional formulae corresponding to cryptanalysis tasks were generated (and then solved 
by a SAT solver). Similar approach was also used for cryptanalysis of des [TlJ. URSA is 
a general framework aimed not only to cryptanalysis tasks, but to a much wider range of 
problems. The framework consists of both a modelling language (instead of C+-I-) and the 
solving machinery, tightly integrated. The system is stand-alone, does not involve the C-I-+ 
language in the modelling process, and the language itself defines the modelling power of 
the approach. The related, C-I-+ based approach, has an advantage that the specification 
can be used directly as a C-I-+ code, but URSA specifications can often be used as C code 
in verbatim. 

There are several general-purpose constraint solving systems based on SAT. Thanks 
to the advances in SAT technology, such systems can be very efficient, despite the main 
weakness of this approach: a domain knowledge and a global structure of the problem are 
lost when it is reduced to the simple propositional logic. SUGARp^ is a SAT-based constraint 
solver that uses the order encoding [66]. It is focused on compiling finite linear CSP into 
SAT. SUGAR uses a syntax of CSP that is designed to cover the notation of the XCSP 2.1 
format. FZNTiNjfl is a general constraint solver that solves constraint satisfaction and op- 
timization problems (not involving floating point numbers) given in the general constraint 
language FlatZinc (typically produced from MiNlZiNC specifications) by translating them 
to SAT and calling a SAT solver Tinisat [35]. The system can be also used for translat- 
ing to SAT only (so, it can be used by independent SAT solvers). FznTini uses a fixed 
encoding directly related to binary representation of integers (the two's complement rep- 
resentation) that corresponds to the compact encoding. NPSpec is a modelling language 
for constraint problems, going with a tool Spec2Sat that compiles specifications into SAT 
instances [11]. NPSpec uses a highly declarative style of programming, similar to Datalog 
(a query language for deductive databases). The semantics of NPSpec is based on the 
model minimality, an extension of the least fixed point semantics of the Horn fragment of 
first order logic |llj . The system MXcjfl, focused on NP-hard problems, uses a modelling 
language based on classical first order logic, and for a given specification produces a propo- 
sitional formula (and passes it to a SAT solver) [51] 158] . The system can also translate 
problem specification to SAT extended with cardinality constraints. 

91 

http: //bach. istc .kobe-u. ac . jp/sugar/ 
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Apart from systems that can translate problem specifications to SAT, there are also 
systems that translate such specifications to smt. For instance, fzn2smt [5] translates from 
the FlatZinc language and the system simply |4j translates from a declarative modelling 
language similar to, but simpler than MiniZinc. Both systems translate to the standard 
SMT-LIB forma10 and can use various underlying SMT theories and available SMT solvers. 

The system URSA and the above systems share the underlying solving technology, but 
their specification languages are very different. In contrast to URSA, all of the above tools use 
declarative languages, and they don't have features of imperative languages (e.g., destructive 
assignments), as discussed in Section 18. 2[ Also, the languages used by the above tools 
provide support for various operators and global constraints, but typically do not support 
bit-wise operators and constraints involving modular arithmetic, which can be essential in 
many cases. This way, the input languages restrict the tools from having the full power 
of modelling in propositional logic. Concerning expressiveness, SUGAR and FznTini use 
rather simple, low-level specification language without flow control structures and other 
features of programming languages that URSA has. Using representation based on binary 
representation of integers is very similar in FznTini and URSA (with a minor difference that 
FznTini uses the two's complement representation). 



8.4. Experimental Comparison. It is very difficult to make a fair and thorough compar- 
ison of the modelling systems: they were not built with the same motivation and purposes; 
it is not only performance that is important but also expressiveness and ease of acquiring a 
modelling language; some systems perform better on some sorts of problems (and worse on 
the other); a single problem can be often specified in different ways; most systems do not 
share the same input language (even if they do, some types of specifications may be better 
suited to some systems); the leading systems are under ongoing developments, and there 
are new features being added constantly; there are new emerging systems, etc. Still, with 
all of the above cautious, several (a very few, likely due to the above difficulties) existing 
reports give some general picture of efficiency of systems for constraint solving over finite 
domains [2A\ \^3\ |2T] while new insights are provided by a number of system competitions 
recently initiated. 

For experimental comparison with the URSA system, we used the following state-of- 
the-art systems (and versions}, including several industrial ones: constraint logic program- 
ming systems B-Prolog 7.43 and SICStus 4.1.10 (with the clpdf module), a deductive 
database system (using Disjunctive Datalog language that combines databases and logic 
programming ) DL\0 [42j, ASP solvers (used with Smodels specifications and a grounder 
Lparse 1.1.2) CLASP 1.3.fl [17], Cmodels 3.7S [31J and Smodels 2.343 [65], hybrid op- 
timization system (using a custom object-oriented programming language) comet 2.1.]|^ 
a hybrid optimization system (using a custom declarative modelling language) IBM 



^ http : //www . smtlib . org/ 1 



24 

^"http : //www.probp. com/ 
^'http : //www. sics . se/isl/sicstuswww/site/J 
^'http : //www. dbai .tuwien. ac . at/proj/dlv 
^^http : //www. cs .uni-potsdam. de/clasp/ 
^'http : //www. cs .utexas . edu/~ tag/ cmode ls/] 
^•■http : //www . tcs . hut .fi/Sof tware/ smodels/ 1 



^ ^http : // dynadec . com 



URSA: A SYSTEM FOR UNIFORM REDUCTION TO SAT 



27 



ILOG GPL 6.30, a default finite-domain g12/fd solver Mercury (using MiniZinc as its 
input language) from MiniZinc 1.1. 1@ [M], and SAT-based systems SUGAR 1.14.6 and FzN- 
The URSA system was used with clasp 1.2.0 as an underlying SAT solver. 
We performed experimental comparison between all of the above tools on a prototyp- 
ical GSP problem — the queens problem (that involves different sorts of constraints), and 
additional comparison between the systems that performed the best in the first phase. Most 
of the used specifications are given as part of the system distributions; they are typically 
straightforward for their systems and symmetry breaking conditions or some other addi- 
tional constraints were not used. Experiments were performed on a PC computer with Intel 
Celeron 420 1.60Ghz, 1Gb RAM and the time threshold for finding all solutions was 600s. 
The measured times are reported by the systems themselves or by adding the spent "user" 
and "system" time0 

The Queens problem. Table [5] shows results of experiments for the queens problem for 
instance sizes from 8 to 14. The URSA system was used with the specification queens-2 
(see page I19p . The most efficient system on this benchmark was B-Prolgg, followed by 
SICStus, and then by URSA, that had similar performance as g12/fd and clasp. It is 
interesting to notice that the URSA system using clasp as an underlying sat solver was 
around the same efficiency as the clasp solver used as ASP solver, which suggests that the 
reduction to SAT used by URSA is very efficient. 



N 


solutions 


Cmodels 


Smodels 


DLV 


OPL 


COMET 


CLASP 


G12/FD 


URSA 


SICStus 


B-PROLOG 


8 


92 


0.11 


0.09 


0.04 


0.08 


0.02 


0.02 


0.03 


0.08 


0.01 


0.01 


9 


352 


0.75 


0.35 


0.15 


0.30 


0.09 


0.07 


0.08 


0.16 


0.03 


0.01 


10 


724 


7.42 


1.41 


0.45 


0.63 


0.34 


0.22 


0.22 


0.37 


0.09 


0.02 


11 


2680 


132.20 


10.51 


2.31 


2.28 


1.64 


0.89 


1.12 


1.18 


0.50 


0.10 


12 


14200 


>600 


44.37 


12.67 


12.56 


9.34 


4.51 


5.91 


4.92 


2.53 


0.45 


13 


73712 


>600 


331.54 


83.36 


62.89 


48.73 


25.50 


31.48 


26.47 


14.36 


2.85 


14 


365596 


>600 


>600 


479.91 


301.88 


246.90 


190.48 


188.43 


164.19 


76.89 


17.64 



Table 5: Results of experimental comparison of ten tools (including ursa) applied on the 
N queens problem for A*" = 8, . . . , 14 

The above result don't include the systems that translate problem specifications to SAT 
and which are the systems closest to URSA. Namely, these systems translate inputs to SAT 
(so it can be considered that they share the solving mechanism), but they use different SAT 
solvers. A fair comparison would be thus to use these systems only as translators to SAT and 
then use the same SAT solver (for instance, clasp) for finding all models of the generated SAT 
formulae. It is interesting to consider size of generated formulae and solving times (of course, 
smaller formulae does not necessarily lead to shorter solving times). FznTini was used with 
FlatZinc specifications obtained from the MiniZinc specification used by g12/fd (with 

^' http : //www- 01 . ibm. com/sof tware| 
^''www.gl2 . cs .mu. oz . au7niinizinc/ 

^*We didn't include the systems MXG and NPSpec/Spec2Sat in this evaluation: MXG was not publicly 
available and NPSpec/Spec2Sat was not maintained since 2005, and also its reported performance |10] . 
especially for the SAT formulae generation phase, is significantly poorer than of URSA and other considered 
systems. 

^^For some recommendations on how to benchmark constraint solving systems visit 
|http: //www.dbai . tuwien. ac . at/proj/dlv/bench/. 
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integers encoded with 5 bits) and with FlatZinc specifications obtained from a MiniZinc 
specification made in the style of the direct encoding^ we will denote them by 1 and 2. 
SUGAR was used only with a specification that employs the order encoding. URSA was used 
with the specifications queens- 1 (with integers encoded with 5 bits), queens-2 (with the 
number of bits equal the instance dimension), and queens-3, (with integers encoded with 4 
bits). Table [6] presents the obtained experimental results. All recorded times were obtained 
for the "quiet" mode of the SAT solver (without printing the models) . Times for generating 
formulae were negligible (compared to the solving phase) for all systems, so we don't report 
them here. 

For related specifications, URSA's queens- 1 gave much smaller formulae (probably 
thanks to techniques mentioned in Section 16. 2p and somewhat better performance than 
FznTini 1, which suggests that FznTini does not benefit much from information about 
the global structure of the problem. The formulae generated by SUGAR were significantly 
smaller than in the above two cases, and led to much better solving efficiency. However, 
it was outperformed by the remaining entrants. The URSA's specifications queens-2 and 
queens-3 gave similar results. The specification queens-3 produced formulae with the 
smallest number of clauses. FznTini 2 produced formulae with the smallest number of 
variables. The best results in terms of the solving times were obtained also by FznTini 2. 
It can be concluded that URSA can produce, with suitable problem specifications, proposi- 
tional formulae comparable in size and in solving times with formulae produced by related 
state-of-the-art systems. 

Additional Experiments. In additional experiments, only the systems that performed the 
best on the queens problem were considered (with only one constraint logic programming 
system kept): B-PROLOG, CLASP, FznTini, g12/fd, and ursa. The following problems 
were considered (for all problems all solutions were sought): 

Golomb Ruler: The problem (actually, one of its variation) is as follows, given a value L 
check if there are m numbers oq, ai, . . ., am-i such that = ao < oi < ... < Om-i = L 
and all the m{m — l)/2 differences aj — Ui, < i < j < m are distinct (problem 6 
at CSPlitF^. The experiments were performed for m = 5,..., 11, with the largest L 
that make the problem unsatisfiable and with the smallest L that make the problem 
satisfiable. 

Magic Square: A magic square of order N is a N x N matrix containing the numbers 
from to N"^ — 1, with each row, column and main diagonal equal the same sum. The 
problem is to find all magic squares of order (problem 19 at CSPlib). The experiments 
were performed for = 3 and A^ = 4. 

Linear Recurrence Relations: Linear homogeneous recurrence relations of degree k 
are of the form: Tn-^-k+i = CLkTn+k + . . . + aiTn+i for n < 0. Given Ti, . . . , Tfc and 
n, Tn can be simply calculated, but finding explicit formula for Tj requires solving a 
nonlinear characteristic equation of degree k, which is not always possible. So, the 
following problem is nontrivial: given Ti, . . ., Tfc_i and T„, find T^. For the experiment, 
we used the relation T„+3 = Tn+2 + ^n+i + Tn, Ti = T2 = I. We generated instances 
with the (only) solution T3 = 1 and the systems were required to seek all possible values 
for T^. Additional constraints (used explicitly or implicitly) for all considered systems) 

"^^Therefore, these translations to sat are rather by two systems: the MiniZinc to FlatZinc converter 
and FznTini. 

'^' http: //www, csplib . org/| 
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Dimension 


8 


9 


10 


11 


12 


13 


14 


FznTini 1: 


variables 


3012 


3825 


4735 


5742 


6846 


8047 


9345 


clauses 


9128 


11628 


1 i KiO 


17567 


21000 


21713 


2«770 


all solutions 


0.15 


0.79 


3.20 


14.53 


111.78 


>600 


>600 


URSA (quccns-1): 


variables 


841 


1052 


1286 


1560 


1819 


2139 


2468 


clauses 


3352 


4217 


5179 


6295 


7390 


8712 


10089 


all solutions 


0.08 


0.53 


2.83 


17.60 


98.04 


>600 


>600 


SUGAR: 


vaxiables 


220 


284 


356 


436 


524 


620 


724 


clauses 


1138 


1653 


2253 


3012 


3924 


5003 


6263 


all solutions 


0.02 


0.06 


0.31 


1.58 


9.59 


68.07 


411.15 


URSA (queens-2): 


variables 


542 


739 


978 


1263 


1598 


1987 


2434 


clauses 


3319 


5008 


7280 


10258 


14077 


18884 


24838 


all solutions 


0.01 


0.04 


0.12 


0.70 


4.01 


23.17 


138.55 


URSA (queens-3): 


variables 


176 


225 


280 


341 


408 


481 


560 


clauses 


800 


1110 


1490 


1947 


2488 


3120 


3850 


all solutions 


0.01 


0.03 


0.12 


0.69 


3.82 


21.09 


136.45 


FznTini 2: 


variables 


128 


162 


200 


242 


288 


338 


392 


clauses 


872 


1236 


1690 


2244 


2908 


3692 


4606 


all solutions 


0.01 


0.02 


0.07 


0.33 


1.52 


8.39 


52.12 



Table 6: Data on SAT formulae generated by FznTini, SUGAR, and URSA for the N queens 
problem for A'' = 8, . . . , 14 and solved by the CLASP SAT solver 

were Tj > and Ti < Tn, ioi i = 1, ... ,n. URSA and FznTini were used with 32 bit 
length for numerical values. 

Non-linear recurrence relations: In non-linear homogenous recurrence relations of de- 
gree k, the link between Tn+k and r„_|_fc_i, Tn+k-2, Tn, is not necessarily linear. 
For the experiment we used the relation Tn+4 = Tn+s • Tn+2 + ^n+i, Ti = T2 = 1. We 
generated instances with the (only) solution T3 = 1 and the systems were required to 
seek all possible values for T^. Additional constraints (used explicitly of implicitly) for all 
considered systems) were Tj > and Tj < T^, for z = 1, . . . , n. For all problem instances, 
the size of all relevant values Tj were smaller than 2^^. URSA and FznTini were used 
with 32 bit length for numerical values. 

The URSA was used with the following specification for the Golomb ruler problem (for the 

instance m = 7, L = 25): 

nM=7; 
nL=25 ; 

bRulerEndpoints = niim2bool (nRuler & nRuler » nL & 1); 

nMarks=2 ; 

bDistanceDif f =true ; 
for(ni=l; ni<=nL-l; ni++) { 

nMarks += (nRuler » ni) & 1; 

n = (nRuler & (nRuler « ni)); 

bDistanceDif f &&= (n & (n-l))==0; 
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> 

assert_all(bRulerEndpoints && nMarks==nM && bDistanceDiff ) ; 

The above specification employs a binary representation of the ruler (nRuler) where 
each bit set denotes a mark. The value nRuler & nRuler » nL & 1 equals 1 if and only 
if the first and nLth bit are set (as the ruler endpoints). The value nMarks counts the bits 
set (e.g., the marks) and it should equal nM. If the ruler nRuler is a Golomb ruler, then 
whenever it is shifted left (for values 1, . . ., nL-l) and bitwise conjunction is performed with 
the original ruler giving the value n, there will be at most one bit set in n (since all the 
differences between the marks are distinct). There is at most one bit set in n if and only 
if the value n & (n-1) equals 0. This specification, employing a single loop, illustrate the 
expressive power of bitwise operations supported in the URSA language. 

For the magic square problem, URSA was used with the following specification: 
iiDim=4; 
rLN=nDim*rLDim ; 

bCorrectSim = (2*rLSiim*nDim == nN*(nN-l)); 

bDomain=true ; 
bDistiiict=true ; 
f or(rLi=0;ni<nDim;ni++) {. 
f or(nj=0;nj<nDim;nj++) { 

bDomain &&= (nT[ni] [nj]<nN) ; 
for (iLk=0 ; nk<iLDim ; nk++ ) 
for (nl=0 ; nKnDim ; nl++) 

bDistinct &&= ((ni==ni && nj==nl) || nT[ni] [nj] !=nT[iik] [nl] ) ; 

} 

} 

bS"iim=true ; 
nSuml=0; 
nSum2=0 ; 

f or (iii=0;ni<nDim;ni++) { 
nSuml += nT [ni] [ni] ; 
nSum2 += nT[ni] [nDim-ni-l] ; 

nSuin3=0 ; 
nSum4=0 ; 

f or (iij=0;nj<iiDim;nj++) { 
nSumS += nTCni] [nj] ; 
nSum4 += nT[nj] [ni] ; 

> 

bSum &&= (nSuni3==nSum) ; 
bSum &&= (nSuni4==nSvmi) ; 

} 

bSum &&= (nSuiiil==nSum) ; 
bSum &&= (nSum2==nSum) ; 

assert_all(bCorrectSum && bDomain && bDistinct && bSum) ; 

For the linear recursive relation, URSA was used with the following specification (the 
specification for the non-recursive relations is analogous): 

n =30; 

ny = 20603361; 
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nl=l; 
n2=l; 
n3=nx ; 

for(ni=4; ni <= n; ni++) { 
ntmp=nl+n2+n3 ; 
nl=n2; 
n2=n3 ; 
n3=ntmp ; 

bDomain &&= (n3<=ny) ; 

> 

assert_all (bDomain kk n3==ny) ; 

Table [7] shows experimental results (with translation times included). FznTini was 
used with clasp as an underlying solver (the built-in solver gave poorer results). The 
number of variables and clauses generated by URSA were, in these benchmarks, smaller 
than for FznTini. For lparse/clasp, the translation time was significant, and some of 
the poor results for some benchmarks are due to large domains (while the system works 
with relations rather than functions). For recurrence relations, g12/fd reported model 
inconsistency when it approached its limit for integers, while B-prolog just failed to find 
a solution for larger instances. Overall, on this set of benchmarks, URSA gave better results 
than CLASP and FznTini and on some benchmarks outperformed all other tools. 

Discussion. The described limited experiments cannot give definite conclusions or ranking 
of the considered systems, as discussed above. In particular, one may raise the following 
concerns, that can be confronted with the following arguments: 

• URSA was used with a good problem specifications, and there may be specifications for 
other systems that lead to better efficiency. However, almost all specifications were taken 
from the system distributions, given there to illustrate the modelling and solving power 
of the systems. Also, the problem specifications used for URSA are also probably not 
the best possible, but are rather straightforward, as specifications for other systems. In 
addition, in contrast to URSA, other modelling systems typically aim at liberating the 
user of thinking of details of internal representations and are free to perform internal 
reformulations of the input problem. Making specifications in URSA may be somewhat 
more demanding than for some other systems, but gives to the user a fuller control of 
problem representation. 

• Some specifications used for URSA are related to the direct encoding (known to be effi- 
cient, for example, for the queens problem), while this is not the case with other systems. 
What is suitable for SAT-based systems is not necessarily suitable for other systems. For 
instance, g12/fd gives significantly poorer results with the specification of the queens 
problem based on the direct encoding, than with the one used in the experiment. This 
is not surprising, because systems that are not based on SAT does not necessarily handle 
efficiently large numbers of Boolean variables and constraints (in contrast to SAT solvers) 
and lessons from the SAT world (e.g., that for some sort of problems, some encoding 
scheme is the most efficient) cannot be a priori applied to other solving paradigms. 

• URSA uses bit-wise logical operators, while other systems do not ( as they don't have support 
for them). Bit- wise logical operators make one of advantages of URSA, while in the same 
time, some other systems use their good weapons (e.g., global constraints such as all- 
different). 
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Dimension 


B-PROLOG 


CLASP 


FznTini 


G12/FD 


URSA 


Golomb ruler 


5/10 


0.01 


5.36 


0.20 


0.06 


0.01 


6/16 


0.01 


44.68 


1.16 


0.08 


0.02 


7/24 


0.01 


350.11 


9.53 


0.10 


0.10 


8/33 


0.08 


>600 


111.90 


0.24 


0.69 


9/43 


0.69 


>600 


>600 


1.18 


4.89 


10/54 


5.34 


>600 


>600 


7.84 


35.55 


11/71 


105.54 


>600 


>600 


93.2 


571.90 


5/11 


0.01 


6.33 


0.32 


0.07 


0.01 


6/17 


0.01 


57.40 


1.43 


0.08 


0.01 


7/25 


n.oi 


429.98 


14.15 


0.10 


0.13 


8/:i 1 


0.09 


>G()() 


10().2() 


0.2() 


0.87 


9/44 


0.78 


>600 


>600 


1.27 


5.87 


10/55 


6.86 


>600 


>600 


6.44 


37.28 


11/ 72 


115.81 


>G()() 


>()()() 


12"). 50 


150.41 


Magic square 


3 


0.01 


0.05 


0.04 


0.01 


0.01 


1 


1.71 


1G2.21 


XiOO 


10. 2() 


1)3.01 


Linear recurrence relations 


4 


0.00 


0.01 


0.01 


0.01 


0.00 


5 


0.00 


0.06 


0.01 


0.02 


0.00 


6 


0.00 


1.18 


0.01 


0.02 


0.01 


7 


0.00 


25.49 


0.01 


0.02 


0.01 














28 


43.83 


>600 


0.17 


143.54 


0.31 


29 


84.92 


>600 


0.68 


incous 


0.33 


:!() 


15cS.78 


>()()() 


1.02 


iiicoiis 


0.47 


Non-linear recurrence relations 


4 


0.00 


0.00 


0.01 


0.01 


0.00 


5 


0.00 


0.01 


0.01 


0.01 


0.02 


6 


0.00 


0.42 


0.22 


0.02 


0.05 


7 


0.00 


126.05 


0.36 


0.02 


0.08 


8 


0.00 


>600 


0.51 


0.02 


0.13 


9 


fail 


>600 


0.76 


0.02 


0.28 


10 


fail 


>600 


0.88 


0.03 


0.53 


11 


fail 


>600 


0.97 


incons 


0.77 



Table 7: Results of experimental comparison of five tools 



In summary, the presented experiments suggest that the URSA system (althoTigh it is 
not primarily a CSP solver but a general system for reducing problems to SAt) is competitive 
to the state-of-the-art, both academic and industrial, modelling systems — even if they can 
encode high-level structural information about the input problem and even if they involve 
specialized underlying solvers (such as support for global constraints like all-different). A 
wider and deeper comparison between these (and some other) constraint solvers (not sharing 
input language) and with different encodings of considered problems, would give a better 
overall picture but is out of scope of this paper. 
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9. Future Work 

The current system (with the presented semantics and the corresponding implementation) 
uses one way (binary representation) for representing (unsigned) integers but (as shown in 
the given examples), it still enables using different encoding styles in specifications. For 
further convenience, we are planning to natively support other representations for integers, 
so the user could choose among several encodings. Also, signed integers and floating point 
numbers could be supported. 

The language URSA (and the interpreter) can be extended by new language constructs 
(e.g., by division). A new form of the assert can be added, such that it propagates interme- 
diate solutions to subsequent commands of the same sort. Support for global constraints can 
also be developed, but primarily only as a ,, syntactic sugar" — the user could express global 
constraints more easily, but internally they would be expanded as if they were expressed 
using loops (i.e., as in the current version of the system). Alternative forms of support for 
global constraints would require substantial changes in the SAT-reduction mechanism. 

On the lower algorithmic and implementation level, we are planning to further improve 
the current version of transformation to CNF. 

In the current version, ground integers are represented by built-in fixed-precision in- 
tegers, which is typically sufficient. However, in order to match symbolic integers, ground 
integers should be represented by arbitrary (but fixed) length integers and we are planning 
to implement that. 

Concerning the underlying SAT solvers, currently only two complete SAT solvers are 
used. We are planning to integrate additional solvers, since some solvers are better suited 
to some sorts of input instances, as the SAT competitions show. Within this direction of 
work, we will also analyze performance of stochastic solvers within URSA. In addition, 
we are exploring potentials of using non-CNF SAT solvers |67l [53l [38] within URSA, which 
avoids the need for transformation to CNF [69]. Choosing among available solvers can be 
automated by using machine learning techniques for analysis of the generated SAT instances 
(or even input specifications) |73[ I55j. For solving optimization problems, instead of the 
existing naive implementation, we are planning to implement more advanced approaches 
and to explore the use of MaxS AT and pseudo-Boolean solvers [3J . 

On the theoretical side, the full operational semantics outlined in this paper can be 
formally defined and it could be proved that solutions produced by the URSA system indeed 
meet the specifications and if there are no produced solutions, then the specifications is 
inconsistent. Along with the formal verification (i.e., verification within a proof assistant) 
of the SAT solver ArgoSAT used |45[ I44j. that would provide a formal correctness proof of 
the URSA system (which would make it, probably, the first trusted constraint solver). 

In the presented version of URSA, reducing to SAT is tightly integrated (and defined by 
the semantics of the system) in the program execution phase. An alternative would be as 
follows: during the program execution phase, a first-order formula is generated and only 
before the solving phase it is translated to a propositional formula. Moreover, the generated 
formula would not need to be translated to a propositional formula, but could be tested for 
satisfiability by using smt (satisfiability modulo theory) solvers (e.g., for linear arithmetic, 
equality theory, alldifferent theory etc.) [2]. In particular, symbolic computations employed 
by the URSA system are closely related to the theory of bit-vector arithmetic and to decision 
procedures for this theory based on "bit-blasting" [8], [9]. Since solvers for this theory 
typically cover all the operators used in URSA, the theory of bit-vector arithmetic can be 
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used as an underlying theory (instead of propositional logic) and any solver for bit-vectors 
arithmetic can be used as a solving engine. Generally, the reduction could be adaptable to 
SMT solvers available — if some solver is available, then its power can be used, otherwise all 
generated constraints are exported to propositional logic. This would make the approach 
more powerful and such development is the subject of our current work — the system URSA 
MAJOR will be able to reduce constraints not only to SAT but to different smt theories. That 
system will be a general constraint solver and also a high-level front-end to the low level 
SMT-LIB interchange format, and, further, to all SMT solvers that supports it. Reduction to 
the theory of bit-vector arithmetic is firstly explored in this context [l6] and it shows that 
reducing to bit-vector arithmetic does not necessarily lead to more efficient solving process 
than reducing to SAT (and the same holds for other smt theories). 

With the increased power of the presented system (by using both SAT and SMT solvers), 
we are planning to further consider a wide range of combinatorial, NP-complete problems, 
and potential one-way functions and also to apply the URSA system to real-world problems 
(e.g., the ones that are already being solved by translating them to sat). Some applications 
in synthesis of programs are already the subject of our current work. 

For the sake of easier practical usability of the URSA system, we are planning to develop 
a support for integration of URSA with popular imperative languages (C, C++, Java). 

10. Conclusions 

In this paper we described a novel approach for uniform representation and solving of a 
wide class of problems by reducing them to SAT. The approach relies on: 

• a novel specification language that combines features of imperative and declarative lan- 
guages — a problem is specified by a test, expressed in an imperative form, that some 
values indeed make a solution to the problem; 

• symbolic computations over unknowns represented by (finite) vectors of propositional 
formulae. 

The approach is general and elegant, with precisely defined syntax and induced (by 
the concept of symbolic execution) semantics of the specification language. This enables 
straightforward implementation of the system and it works as a "clear box" . The proposed 
language is a novel mixture of imperative and declarative paradigms, leading to a new 
programming paradigm. Thanks to the language's declarative aspects — the problem is 
described by what makes a solution and not by describing how to find it — using the 
system does not require human expertise on the specific problem being solved. On the other 
hand, specifications are written in imperative form and this gives the following advantages 
compared to other modelling languages (all of them are declarative): 

• problem specifications can involve destructive assignments, which is not possible in declar- 
ative languages and this can be essential for many sorts of problems (e.g., from software 
verification) ; 

• modelling problems that naturally involve loops (and nested loops) is simple and the 
translation is straightforward; 

• for users familiar with imperative programming paradigm, it should be trivial to acquire 
the specification language URSA (since there are no specific commands or flow-controls 
aimed at constraint solving); 
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• the user has a fuller control of internal representation of the problem, so can influence 
the efficiency of the solving phase. 

• a specification can be taken, almost as-is, from and to languages such as C (within C, 
such code would check if some given concrete values are indeed a solution of the problem) . 

• the system can smoothly extend imperative languages like C/C++ or Java, (as constraint 
programming extends logic programming). 

• the system can be verified to be correct in a straightforward manner. 

In addition, the system URSA, in contrast to most of (or all) other modelling systems, 
supports bit-wise logical operators and constraints involving modular arithmetic (for the 
base of the form 2"), which is essential for many applications, and can also enable efficient 
problem representation and problem solving. 

The proposed approach can be used for solving all problems with Boolean and numerical 
unknowns, over Boolean parameters and numerical parameters with finite domains that can 
be stated in the specification language (e.g., the domain of the system is precisely determined 
by expressiveness of its specification language). 

The search for required solutions of the given problem is performed by modern SAT 
solvers that implement very efficient techniques not directly applicable to other domains. 
While SAT is already used for solving a wide range of various problems, the proposed system 
makes these reductions much easier and can replace a range of problem-specific tools for 
translating problems to SAT. The tool URSA can be used not only as a powerful general 
problem solver (for finite domains), but also as a tool for generating SAT benchmarks (both 
satisfiable and unsatisfiable) 

Concerning weaknesses, URSA is not suitable for problems where knowledge of the do- 
main and problem structure are critical and can be efficiently tackled only by specialized 
solvers, and this holds for reduction to SAT generally. Due to its nature, by interfacing URSA 
with standard specification languages like XCSP or MiniZinc, most of the URSA modelling 
features and power would be lost (e.g., bit-wise logical operators and destructive updates), 
while global constraints supported by these languages would be translated in an inefficient 
way. Therefore, it makes no much sense to enable conversion from these standard languages 
to URSA and this makes URSA a bit isolated system in the world of constraint solvers or 
related systems. 

In this paper we do not propose: 

• a new SAT-encoding technique — rather, the URSA specification language can be used for 
different encoding styles; 

• a technique for transforming a SAT formula to conjunctive normal form — this step is 
not a part of the core of the URSA language and is not covered by its semantics, so any 
approach (meeting the simple specification) can be used; the current technique seem to 
work well in practice, while it can still be a subject of improvements. 

• a SAT solver — rather, URSA can use any SAT solver (that can generate all models for 
satisfiable input formulae); moreover, having several SAT solver would be beneficial, since 
some solvers are better suited to some sorts of problems. 

In this paper we also described the system URSA that implements the proposed ap- 
proach and provided some experimental results and comparison with related systems. They 

38a 

large number of SAT instances obtained from cryptanalysis of des [71] (by using sim- 
pler version of the system proposed in this paper), were part of the SAT Competition corpus 
(http: //www. cril .univ-artois .f r/SAT09/ solvers/booklet .pdf ) . 
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suggest that, although URSA is not primarily a CSP solver (but a general system for reduc- 
ing problems to sat), the system is, concerning efficiency, competitive to state-of-the-art 
academic and industrial CSP tools. URSA is also competitive to other system that translate 
problem specifications to SAT. In contrast to most of other constraint solvers, the system 
URSA is open-source and publicly available on the Internet. 

For future work, we are planning to extend the system so it can use not only complete 
SAT solvers, but also stochastic SAT solvers, non-CNF solvers, and SMT solvers. We will also 
work on formal (machine-checkable by a proof assistant) verification of the system (i.e., 
show that the URSA system always gives correct results) and on extensions of the system 
relevant for practical applications. 
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